An MN-Gallager code over Galois fields, GF(q), based on the Dynamical Block Posterior probabilities
(DBP) for messages with a given set of autocorrelations is presented with the following main results:
(a) for a binary symmetric channel the threshold, $f_c$, is extrapolated for infinite messages using the
scaling relation for the median convergence time, $t_{med} \propto 1/(f_c - f)$; (b) a degradation in the
threshold is observed as the correlations are enhanced; (c) for a given set of autocorrelations the
performance is enhanced as q is increased; (d) the efficiency of the DBP joint source-channel coding is
slightly better than the standard gzip compression method; (e) for a given entropy, the performance
of the DBP algorithm is a function of the decay of the correlation function over large distances.



With the rapid growth of information content in today's wire and wireless communication, there is an increasing 
demand for efficient transmission systems. A significant gain in the transmission performance can be achieved by the 
application of the joint source-channel coding technique, which has attracted much attention during the recent past, 
see for instance [1-6] . Roughly speaking, source coding is mainly a data compression process that aims at removing as 
much redundancy as possible from the source signal, whereas channel coding is the process of intelligent redundancy 
insertion so as to be robust against channel noise. These two processes, source coding and channel coding, seem to
act in opposition: the first shrinks the transmitted data and the second expands it. For illustration, assume that
our compression shrinks the size of the source signal by a factor of 2 and that, in order to be robust against channel
noise, we have to expand the compressed file by a factor of 4. Hence, the length of the transmitted sequence is only
twice the length of the uncompressed source.

The source-channel coding theorem of Shannon [7] indicates that if the minimal achievable source coding rate of a 
given source is below the capacity of the channel, then the source can be reliably transmitted through the channel, 
assuming an infinite source sequence. This theorem implies that source coding and channel coding can be treated 
separately without any loss of overall performance, hence they are fundamentally separable. Practically, the source 
can be first efficiently compressed and then an efficient error correction method can be used. 

The objective of joint source-channel coding is to combine both source (compression) and channel (error correction) 
coding into one mechanism in order to reduce the overall complexity of the communication while maintaining
satisfactory performance. Another possible advantage of the joint source-channel coding is the reduction of the sensitivity
to a bit error in a compressed message. 

In a recent paper [8] a particular scheme based on a statistical mechanical approach for the implementation of the 
joint source-channel coding was presented and the main steps are briefly summarized below. The original boolean 
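source is first mapped to a binary source [9,10] $\{x_i = \pm 1\}$, $i = 1, \ldots, L$, and is characterized by a finite set of $k_0$
autocorrelations

$$C_k = \frac{1}{L} \sum_{i=1}^{L} x_i \, x_{(i+k) \bmod L} \qquad (1)$$

where $k \le k_0$ and $k_0$ is the highest autocorrelation taken. The number of sequences obeying these $k_0$ constraints is given by

$$\Omega = \mathrm{Tr}_{\{x_i = \pm 1\}} \prod_{k=1}^{k_0} \delta\!\left( \sum_{j} x_j x_{j+k} - C_k L \right) \qquad (2)$$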
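As an illustration of eq. 1, the following minimal sketch computes the first $k_0$ autocorrelations of a $\pm 1$ sequence with periodic boundaries. The function name `autocorrelations` and the mapping of the boolean values 0/1 to +1/-1 are assumptions made for this example; the text does not specify which boolean value maps to which sign.

```python
import numpy as np

def autocorrelations(x, k0):
    """Return [C_1, ..., C_k0] of eq. 1 for a +/-1 sequence x (periodic boundaries)."""
    x = np.asarray(x, dtype=float)
    # np.roll(x, -k)[i] equals x[(i + k) mod L]
    return np.array([np.mean(x * np.roll(x, -k)) for k in range(1, k0 + 1)])

# Assumed mapping for this example: boolean 0 -> +1, boolean 1 -> -1
bits = np.array([0, 0, 1, 1, 0, 0, 1, 1])
x = 1 - 2 * bits
C = autocorrelations(x, 2)
print(C)  # C_1 = 0 and C_2 = -1 for this period-4 pattern
```

Because the index is taken modulo L, every one of the L products contributes to each $C_k$, so the values are exact averages rather than estimates that lose terms at the boundary.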




Introducing the Fourier representation of the delta functions and then using the standard transfer matrix (of size
$2^{k_0} \times 2^{k_0}$) method [11], one finds $\Omega = \int \prod_k dy_k \exp\{-L[\sum_k y_k C_k - \frac{1}{k_0} \ln \lambda_{max}(\{y_k\})]\}$, where $\lambda_{max}$ is the maximal eigenvalue
of the corresponding transfer matrix. For large L, using the saddle point method, the entropy, $H_2(\{C_k\})$, is given in
the leading order by

$$H_2(\{C_k\}) = \frac{1}{\ln 2} \left[ \frac{1}{k_0} \ln \lambda_{max}(\{y_k\}) - \sum_{k=1}^{k_0} y_k C_k \right] \qquad (3)$$

where $\{y_k\}$ are determined from the saddle point equations of $\Omega$ [8]. Assuming a Binary Symmetric Channel (BSC) and
using Shannon's lower bound, the channel capacity of sequences with a given set of $k_0$ autocorrelations is given by
where $f$ is the channel bit error rate and $p_b$ is a bit error rate. The saddle point solutions derived from eq. 3 indicate
that the equilibrium properties of the one dimensional Ising spin system

$$H = -\sum_{k=1}^{k_0} y_k \sum_{i} x_i x_{i+k} \qquad (5)$$

obey in the leading order the autocorrelation constraints of eq. 2. Note that in the typical scenario of statistical
mechanics, one of the main goals is to calculate the partition function and the equilibrium properties of a given
Hamiltonian. Here we present a prescription of how to solve the reverse question. Given the desired macroscopic
properties, the set of the autocorrelations, the goal is to find the appropriate Hamiltonian obeying these macroscopic
constraints. This property of the effective Hamiltonian, eq. 5, is used in simulations below to generate an ensemble
of signals (source messages) with the desired set of autocorrelations.
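One way to draw such signals is single-spin-flip Metropolis sampling of the chain Hamiltonian of eq. 5 at $\beta = 1$. The sketch below is only an illustration under that assumption: the text does not state which sampling method was used, and the names `sample_sequence` and `autocorrelation`, the number of sweeps, and the seed are hypothetical choices.

```python
import math
import random

def sample_sequence(y, L, sweeps=200, seed=0):
    """Metropolis sampling of H = -sum_k y[k-1] * sum_i x_i x_{(i+k) mod L} at beta = 1."""
    rng = random.Random(seed)
    k0 = len(y)
    x = [rng.choice((-1, 1)) for _ in range(L)]
    for _ in range(sweeps * L):
        i = rng.randrange(L)
        # Local field from the 2*k0 spins coupled to spin i (assumes L > 2*k0)
        h = sum(y[k - 1] * (x[(i + k) % L] + x[(i - k) % L]) for k in range(1, k0 + 1))
        dE = 2.0 * x[i] * h  # energy change if x_i is flipped
        if dE <= 0.0 or rng.random() < math.exp(-dE):
            x[i] = -x[i]
    return x

def autocorrelation(x, k):
    """C_k of eq. 1 with periodic boundaries."""
    L = len(x)
    return sum(x[i] * x[(i + k) % L] for i in range(L)) / L

# Nearest-neighbour check: for k0 = 1 the long-chain value is C_1 = tanh(y_1)
x = sample_sequence(y=[1.0], L=1000)
print(autocorrelation(x, 1), math.tanh(1.0))
```

For $k_0 = 1$ the measured $C_1$ should be close to $\tanh(y_1)$ up to statistical fluctuations of order $1/\sqrt{L}$, which gives a quick sanity check before moving to larger $k_0$.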

The decoding of symbols of $k_0$ successive bits is based on the standard message passing introduced for the MN
decoder over Galois fields with $q = 2^{k_0}$ [12], with the following modification. The horizontal pass is left unchanged,
but a dynamical set of probabilities assigned to each block is used in the vertical pass. The Dynamical Block
Probabilities (DBP), $\{P_c^n\}$, are determined following the current belief regarding the neighboring blocks and are given
by

$$P_c^n = S_I(c) \left( \sum_{l=1}^{q} q_L^l \, S_L(l,c) \right) \left( \sum_{r=1}^{q} q_R^r \, S_R(c,r) \right) \qquad (6)$$

where $l$/$r$/$c$ denotes the state of the left/right/center ($n-1$ / $n+1$ / $n$) block, respectively, and $q_L^l$/$q_R^r$ are their posterior
probabilities. $S_I(c) = e^{-\beta H_I}$, where $H_I$ is the inner energy of a block of $k_0$ spins at a state $c$, see eq. 5. Similarly,
$S_L(l,c)$ ($S_R(c,r)$) stands for the Gibbs factor of consecutive Left/Center (Center/Right) blocks at a state $l,c$ ($c,r$) [8].
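The block prior of eq. 6 can be sketched as follows for the chain Hamiltonian of eq. 5 at $\beta = 1$. This is an illustration rather than the decoder used in the simulations: the helper names `block_states`, `gibbs_factors` and `dbp_prior`, the ordering of the block states, and the final normalisation of the prior are assumptions of this example.

```python
import itertools
import numpy as np

def block_states(k0):
    """All q = 2**k0 configurations of a block of k0 spins."""
    return np.array(list(itertools.product([1, -1], repeat=k0)))

def gibbs_factors(y):
    """Return S_I(c) (length q) and the pair factor S(a, b) (q x q) at beta = 1."""
    k0 = len(y)
    states = block_states(k0)
    q = len(states)
    S_inner = np.empty(q)
    S_pair = np.empty((q, q))
    for a, sa in enumerate(states):
        # Bonds (i, i + k) with both spins inside the block
        e_in = -sum(y[k - 1] * sa[i] * sa[i + k]
                    for k in range(1, k0 + 1) for i in range(k0 - k))
        S_inner[a] = np.exp(-e_in)
        for b, sb in enumerate(states):
            # Bonds from spin i of block a to spin i + k - k0 of the next block b
            e_int = -sum(y[k - 1] * sa[i] * sb[i + k - k0]
                         for k in range(1, k0 + 1) for i in range(k0 - k, k0))
            S_pair[a, b] = np.exp(-e_int)
    return S_inner, S_pair

def dbp_prior(q_left, q_right, S_inner, S_pair):
    """Eq. 6 for one block, normalised to sum to one (normalisation is an assumption)."""
    p = S_inner * (q_left @ S_pair) * (S_pair @ q_right)
    return p / p.sum()

# Hypothetical couplings and neighbour beliefs for k0 = 2 (q = 4)
S_inner, S_pair = gibbs_factors([0.5, 0.3])
q_left = np.array([0.7, 0.1, 0.1, 0.1])
q_right = np.full(4, 0.25)
print(dbp_prior(q_left, q_right, S_inner, S_pair))
```

Each block costs two matrix-vector products of size q, that is $O(q^2)$ operations, which matches the $q^2$ factor in the complexity of the block prior calculation quoted in the text.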

Note that the complexity of the calculation of the block prior probabilities is $O(Lq^2/\log q)$, where $L/\log q$ is the
number of blocks. The decoder complexity per iteration of the MN codes over a finite field q can be reduced to order
$O(Lqu)$ [13-15], where u stands for the average number of checks per block. Hence the total complexity of the DBP
decoder is of the order of $O(Lqu + Lq^2/\log q)$.

In this Letter we examine the efficiency of the DBP-MN decoder as a function of the maximal correlation length
taken, $k_0$, the strength of the correlations, and the size of the finite field q, and we compare this efficiency with the
standard gzip compression procedure. A direct answer to the above questions is to implement exhaustive simulations
on increasing message length, various finite fields q, and sets of autocorrelations, which result in the bit error probability
versus the flip rate $f$. Besides the enormous computational time required, the conclusions would be controversial since
it is unclear how to compare, for instance, the performance as a function of q: with the same number of transmitted
blocks or with the same number of transmitted bits.

In order to overcome these difficulties, for a given MN-DBP code over GF(q) and a set of autocorrelations, the
threshold $f_c$ is estimated from the scaling argument of the convergence time, which was previously observed for q = 2
[16,17]. The median number of message passing steps, $t_{med}$, necessary for the convergence of the MN-DBP algorithm
is assumed to diverge as the level of noise approaches $f_c$ from below. More precisely, we found that the scaling for
the divergence of $t_{med}$ is independent of q and is consistent with

$$t_{med} \simeq \frac{A}{f_c - f} \qquad (7)$$

where for a given set of autocorrelations and q, A is a constant. Moreover, for a given set of autocorrelations and
a finite field q, the extrapolated threshold $f_c$ is independent of L, as demonstrated in Fig. 1. This observation is
essential to determine the threshold of a code based on the above scaling behavior. Note that the estimation of $t_{med}$
is a simple computational task in comparison with the estimation of low bit error probabilities for large L, especially
close to the threshold. We also note that the analysis is based on $t_{med}$ instead of the average convergence time, $t_{av}$
[16], since we wish to prevent the dramatic effect of a small fraction of finite samples with slow convergence or no
convergence [18,19].
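Rearranging eq. 7 gives $f = f_c - A/t_{med}$, so the threshold is the intercept of a straight-line fit of $f$ against $1/t_{med}$. The sketch below shows this extrapolation on synthetic data generated from eq. 7 itself; the numbers are not simulation results, and the function name `extrapolate_threshold` is hypothetical.

```python
import numpy as np

def extrapolate_threshold(f, t_med):
    """Fit f = f_c - A / t_med (eq. 7 rearranged) and return (f_c, A)."""
    inv_t = 1.0 / np.asarray(t_med, dtype=float)
    slope, intercept = np.polyfit(inv_t, np.asarray(f, dtype=float), 1)
    return intercept, -slope

# Synthetic, noise-free data built from eq. 7 (illustrative values only)
f_c_true, A_true = 0.272, 1.5
f = np.array([0.20, 0.22, 0.24, 0.25, 0.26])
t_med = A_true / (f_c_true - f)

f_c, A = extrapolate_threshold(f, t_med)
print(f_c, A)
```

With real data, $t_{med}$ would be the median over many decoding runs at each flip rate, and the fit would be repeated for several L to check that the intercept does not move.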

FIG. 1. The flip rate $f$ as a function of $1/t_{med}$ for GF(4) with $C_1 = C_2 = 0.8$ and L = 1,000, 5,000, 50,000. The lines are
a result of a linear regression fit. The threshold, $f_c \simeq 0.272$, extrapolated from the scaling behavior eq. 7, is independent of L.



All simulation results presented below are derived for rate 1/3, and the construction of the matrices A and B of the
MN code is taken from [16]. In all examined sets of autocorrelations, $10^3 \le L \le 5 \times 10^4$ and $4 \le q \le 64$, the scaling
for the median convergence time was indeed recovered. At this stage an efficient tool to estimate the threshold of an
MN-DBP decoder exists and we are ready to examine the efficiency of the DBP decoder as a function of $\{C_k\}$ and q.

Results of simulations for q = 4, 8, 16 and 32 and selected sets of autocorrelations are summarized in Table I, where
the symbols are defined as follows: $\{C_k\}$ stands for the imposed values of the autocorrelations as defined in eqs. 1-2; $\{y_k\}$ are
the interaction strengths, eqs. 3 and 5; H represents the entropy of sequences with the given set of autocorrelations,
eq. 2; $f_c$ is the estimated threshold of the MN-DBP decoder derived from the scaling behavior of $t_{med}$; $f_{Sh}$ is
Shannon's lower bound, eq. 4; Ratio is the efficiency of our code, $f_c/f_{Sh}$; $Z_R$ indicates the gzip compression rate
averaged over files of size $10^6$ bits with the desired set of autocorrelations. We assume that the compression rate
with $L = 10^6$ achieves its asymptotic ratio, as was indeed confirmed in the compression of files with different L; $1/R^*$
indicates the ideal (minimal) ratio between the transmitted message and the source signal after implementing the
following two steps: compression of the file using gzip and then using an ideal optimal encoder/decoder, for a given
BSC with $f_c$. A number greater than (less than) 3 in this column indicates that the MN-DBP joint source-channel
coding algorithm is more efficient (less efficient) in comparison to the source-channel separation method using the standard
gzip compression. The last four columns of Table I are devoted to the comparison of our DBP algorithm with advanced
compression methods. $PPM_R$ and $AC_R$ stand for the compression rate of files of size $10^6$ bits with the desired
autocorrelations using Prediction by Partial Match [20] and the Arithmetic Coder [21], respectively. Similarly
to the gzip case, $1/R_{PPM}$ and $1/R_{AC}$ stand for the optimal (minimal) rate required for the separation process (first
a compression and then an ideal optimal encoder/decoder) assuming a BSC with $f_c$.
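The separation benchmark can be made concrete with a short calculation. The sketch below assumes that the compression rate is the ratio of compressed to original size, and that an ideal code on a BSC with flip rate $f$ operates at the capacity $1 - H_2(f)$, so that the expansion is $1/R^* = Z_R / (1 - H_2(f_c))$. The function `shannon_threshold` additionally assumes that, for vanishing bit error rate, the bound takes the form $R = (1 - H_2(f))/H$ with $H$ the source entropy; this form is an assumption of the example and is not quoted from the text. All function names and input values are hypothetical.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def separation_expansion(z_r, f_c):
    """Transmitted bits per source bit: compress to a fraction z_r, then code at capacity."""
    return z_r / (1.0 - h2(f_c))

def shannon_threshold(entropy, rate, tol=1e-12):
    """Largest f in [0, 1/2] with (1 - H2(f)) / entropy >= rate (assumed form), by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (1.0 - h2(mid)) / entropy >= rate:
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical inputs, not values from Table I
expansion = separation_expansion(z_r=0.6, f_c=0.25)
print(expansion, "vs. 3 for the rate-1/3 joint scheme")
print(shannon_threshold(entropy=1.0, rate=1.0 / 3.0))  # unbiased source at rate 1/3
```

An expansion above 3 means the separation scheme would need more transmitted bits than the rate-1/3 MN-DBP code at the same noise level, which is the comparison described in the text.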

TABLE I. Results for q = 4, 8, 16, 32 and selected sets of autocorrelations $\{C_k\}$.

Table I indicates the following main results: (a) For q = 4 (the upper part of Table I) a degradation in the
performance is observed as the correlations are enhanced and, as a result, the entropy decreases. The degradation
seems to be significant when the entropy is below $\sim 0.3$ (or, for the test case R = 1/3, $f_c > 0.3$) [22]. A similar
degradation was observed as the entropy decreases for larger values of q. (b) The efficiency of our joint source-channel
coding technique is superior to the alternative standard gzip compression in the source-channel separation technique.
For high entropy the gain of the MN-DBP is about 5-10%. This gain disappears as the entropy and the performance
of the DBP algorithm are decreased. (c) In comparison to the standard gzip, the compression rate is improved by
2-5% using the AC method. A further improvement of a few percent is achieved by the PPM compression. This
latter improvement seems to be significant in the event of low entropy. (d) Our DBP joint source-channel coding seems
to be superior (by $\sim 3\%$) to the separation method based on the PPM compression for high entropy. However, for
an ensemble of sequences characterized by low entropy this gain disappears. (e) With respect to the computational time
of the source-channel coding, our limited experience indicates that the DBP joint source-channel coding is faster than
the AC separation method and that the PPM separation method is substantially slower.

For a given set of autocorrelations where $C_{k_0}$ is the maximal one taken, the MN-DBP algorithm can be implemented
with any field $q \ge 2^{k_0}$. If one wishes to optimize the complexity of the decoder it is clear that one has to work with the
minimal allowed field, $q = 2^{k_0}$. However, when the goal is to optimize the performance of the code and to maximize
the threshold, the selection of the optimal field, q, is in question. In order to answer this question we present in Fig. 2
results for $k_0 = 2$ ($C_1 = C_2 = 0.86$) and q = 4, 16, 64. It is clear that the threshold, $f_c$, increases as a function of q, as
was previously found for the case of unbiased signals [12]. More precisely, the estimated thresholds for q = 4, 16, 64 are
$\sim$ 0.293, 0.3, 0.309, respectively, and the corresponding Ratios ($= f_c/f_{Sh}$) are 0.913, 0.934, 0.962, where $f_{Sh} = 0.321$.
Note that the extrapolation of $f_c$ for large q seems asymptotically to be consistent with $f_c(q) \sim 0.316 - 0.18/q$.
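The Ratios quoted above follow directly from the thresholds. The short check below recomputes them from the rounded values given in the text, so the last digit can differ slightly from the quoted figures; the variable names are hypothetical.

```python
# Thresholds and Shannon bound as quoted in the text (rounded to three digits)
f_sh = 0.321
thresholds = {4: 0.293, 16: 0.300, 64: 0.309}

# Ratio = f_c / f_Sh for each field size q
ratios = {q: f_c / f_sh for q, f_c in thresholds.items()}
for q, r in ratios.items():
    print(f"q = {q:2d}: f_c / f_Sh = {r:.4f}")
```

The recomputed values agree with the quoted 0.913, 0.934 and 0.962 to within the rounding of the three-digit thresholds.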

FIG. 2. The scaling behavior, $f$ as a function of $1/t_{med}$, for $C_1 = C_2 = 0.86$ and q = 4, 16, 64. The lines are a
result of a linear regression fit. The estimated thresholds for q = 4, 16, 64 are 0.293, 0.3, 0.309, and the corresponding
Ratio $\equiv f_c/f_{Sh}$ = 0.913, 0.934, 0.962, where $f_{Sh} = 0.321$.

For a given q, there are many sets of autocorrelations (or a finite fraction of $\{C_k\}$ in $k_0 - 1$ dimensions) obeying
the same entropy. An interesting question is whether the performance of our DBP algorithm measured by the Ratio
($= f_c/f_{Sh}$) is a function of the entropy only. Our numerical simulations indicate that the entropy is not the only
parameter which controls the performance of the DBP algorithm. For the same entropy and q the Ratio can fluctuate
widely among different sets of correlations. For illustration, in Table II results for two sets of autocorrelations with
the same entropy are summarized for each q = 4, 8, 16 and 32. It is clear that when the Ratio ($= f_c/f_{Sh}$) is strongly
degraded, the gzip performance is superior (the second example with q = 8 and 32 in Table II, where the Ratio is 0.8
and 0.72, respectively). The crucial question is to find the criterion to classify the performance of the DBP algorithm
among all sets of autocorrelations obeying the same entropy. Our generic criterion is the decay of the correlation
function over distances beyond two successive blocks. However, before the examination of this criterion, we would like
to turn back to some aspects of statistical physics.

The entropy of sequences with given first $k_0$ correlations is determined via the effective Hamiltonian consisting
of $k_0$ interactions, eqs. 2-3. As a result the entropy of these sequences is the same as the entropy of the effective
Hamiltonian, $H\{y_k\}$, at the inverse temperature $\beta = 1$, eq. 5. As for the usual scenario of the transfer matrix method,
the leading order of quantities such as the free energy and the entropy is a function of the largest eigenvalue of the
transfer matrix only. On the other hand, the decay of the correlation function is a function of the whole spectrum
of the $2^{k_0}$ eigenvalues. Asymptotically, the decay of the correlation function is determined by the ratio between
the second largest eigenvalue and the largest eigenvalue, $\lambda_2/\lambda_{max}$. From the statistical mechanical point of view one
may wonder why the first $k_0$ correlations can be determined using the information of $\lambda_{max}$ only. The answer to this
question is that once the transfer matrix is defined as a function of $\{y_k\}$, eqs. 3-5, all eigenvalues are determined as
well as $\lambda_{max}$. There is no way to determine $\lambda_{max}$ independently of all other eigenvalues.
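Both quantities discussed above can be read off one transfer matrix. The sketch below builds a block transfer matrix for the chain Hamiltonian of eq. 5 at $\beta = 1$, with each block holding $k_0$ spins, and returns the entropy per spin in bits together with the ratio of the two largest eigenvalue moduli. It rests on several assumptions of this example: the block construction, the relation $C_k = (1/k_0)\, \partial \ln \lambda_{max} / \partial y_k$ evaluated by finite differences, ordering eigenvalues by modulus, and all function names.

```python
import itertools
import numpy as np

def transfer_matrix(y):
    """Block transfer matrix T[a, b] = exp(-E_in(a) - E_int(a, b)) for eq. 5 at beta = 1."""
    k0 = len(y)
    states = list(itertools.product([1, -1], repeat=k0))
    T = np.empty((len(states), len(states)))
    for a, sa in enumerate(states):
        # Bonds inside block a
        e_in = -sum(y[k - 1] * sa[i] * sa[i + k]
                    for k in range(1, k0 + 1) for i in range(k0 - k))
        for b, sb in enumerate(states):
            # Bonds between block a and the following block b
            e_int = -sum(y[k - 1] * sa[i] * sb[i + k - k0]
                         for k in range(1, k0 + 1) for i in range(k0 - k, k0))
            T[a, b] = np.exp(-e_in - e_int)
    return T

def log_lambda_max(y):
    return np.log(np.max(np.abs(np.linalg.eigvals(transfer_matrix(y)))))

def entropy_bits(y, eps=1e-6):
    """Return (entropy per spin in bits, [C_1..C_k0]) for couplings y."""
    y = np.asarray(y, dtype=float)
    k0 = len(y)
    # C_k = (1/k0) d ln(lambda_max) / d y_k, by central differences
    C = np.array([(log_lambda_max(y + eps * e) - log_lambda_max(y - eps * e)) / (2 * eps * k0)
                  for e in np.eye(k0)])
    return (log_lambda_max(y) / k0 - y @ C) / np.log(2.0), C

def decay_ratio(y):
    """|lambda_2| / lambda_max of the block transfer matrix (decay per block of k0 spins)."""
    lam = np.sort(np.abs(np.linalg.eigvals(transfer_matrix(y))))[::-1]
    return lam[1] / lam[0]

H, C = entropy_bits([0.5, 0.3])   # hypothetical couplings with k0 = 2
print(H, C, decay_ratio([0.5, 0.3]))
```

For $k_0 = 1$ the sketch reduces to the textbook nearest-neighbour chain, where $C_1 = \tanh(y_1)$ and the eigenvalue ratio is also $\tanh(y_1)$, which is a convenient check of the construction.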

In Table II results of the DBP-MN algorithm for q = 4, 8, 16, 32 are presented. For each q, two different sets of
autocorrelations characterized by the same entropy and threshold $f_{Sh}$ are examined. The practical method we used to
generate different sets of autocorrelations with the same entropy was a simple Monte Carlo over the space of $\{C_k\}$ [23].
The additional column in Table II (in comparison with Table I) is the ratio $\lambda_2/\lambda_{max}$, which characterizes the
decay of the correlation function over large distances. Independent of q, it is clear that for a given entropy, as $\lambda_2/\lambda_{max}$
increases/decreases the performance of the DBP algorithm measured by the Ratio $f_c/f_{Sh}$ is degraded/enhanced.
The new criterion to classify the performance of the DBP algorithm among all sets of autocorrelations obeying the
same entropy is the decay of the correlation function. This criterion is consistent with the tendency that as the first
$k_0$ autocorrelations are increased/decreased a degradation/enhancement in the performance is observed (see Table
I). The physical intuition is that as the correlation length increases, the relaxation time increases and flips on larger
scales than nearest neighbor blocks are required.

Note that the decay of the correlation function in the intermediate region of a small number of blocks is a function
of all the $2^{k_0}$ eigenvalues. Hence, in order to enhance the effect of the fast decay of the correlation function in the
case of small $\lambda_2/\lambda_{max}$, we also try to enforce in our Monte Carlo search that all other $2^{k_0} - 2$ eigenvalues be less than
$A\lambda_{max}$ with the minimal possible constant A. This additional constraint was easily fulfilled for q = 4 with A = 0.1,
but for q = 32 the minimal A was around 0.5.

Finally we raise the question of whether for a given entropy a degradation in the typical performance of the DBP 
algorithm is expected as q increases. This is crucial since the superiority, if any, of the DBP joint source-channel 
coding method over advanced compression methods is in question. As explained above, our Monte Carlo simulations 
indicate that for a given entropy the suppression of the correlation function is more difficult as q increases. [23] This 
is a strong indication that as q increases a degradation in the typical performance of the DBP decoder is expected, 
but its nature and significance have still to be examined in further research. 

We thank Shlomo Shamai and David MacKay for valuable discussions. 

TABLE II. Results for q = 4, 8, 16, 32 and different sets of autocorrelations. For each q, two different sets of autocorrelations
characterized by the same entropy and threshold $f_{Sh}$ are examined. As $\lambda_2/\lambda_{max}$ increases/decreases the performance of the
DBP algorithm measured by the Ratio $f_c/f_{Sh}$ is degraded/enhanced.